WHAT IS CLAIMED IS:

1. Apparatus for recognizing free-field audio signals, comprising: 

a hand-held device having a microphone to capture free-field audio signals; 

a local processor, coupleable to said hand-held device, to transmit audio signal 
features corresponding to the captured free-field audio signals to a recognition site; 

one of said hand-held device and said local processor including circuitry which 
extracts a time series of spectrally distinct audio signal features from the captured free-field audio 
signals; and 

a recognition processor and a recognition memory at the recognition site, said 
recognition memory storing data corresponding to a plurality of audio templates, said recognition 
processor correlating the audio signal features transmitted from said local processor with at least one 
of the audio templates stored in said recognition processor memory, said recognition processor 
providing a recognition signal based on the correlation. 

2. Apparatus according to Claim 1, wherein said hand-held device includes: 

an analog-to-digital converter which digitizes the captured free-field audio signals; and

a processor which extracts the time series of spectrally distinct audio signal features 
from the captured free-field audio signals. 

3. Apparatus according to Claim 1, wherein said local processor extracts the time 
series of spectrally distinct audio signal features from the captured free-field audio signals.

4. Apparatus according to Claim 1, wherein said local processor comprises a personal
computer coupled to the Internet.

5. Apparatus according to Claim 1, wherein said recognition processor memory 
stores a plurality of audio templates, each template corresponding to substantially an entire audio 
work. 

6. Apparatus according to Claim 5, wherein said hand-held device has a memory 
which stores free-field audio signals which correspond to less than an entire audio work. 

7. Apparatus according to Claim 6, wherein the audio work comprises a song. 

8. Apparatus according to Claim 1, wherein said recognition processor, in response 
to the recognition signal, transmits at least a portion of the at least one template stored in said 
recognition processor memory to said local processor for verification. 

9. Apparatus according to Claim 1, wherein said recognition processor 
mathematically correlates the audio signal features transmitted from said local processor with the 
at least one of the audio templates stored in said recognition processor memory. 

10. A hand-held device for capturing audio signals to be transmitted from a network
computer to a recognition site, the recognition site having a processor which receives extracted 
feature signals that correspond to the captured audio signals and compares them to a plurality of
stored song information, the hand-held device comprising: 
a microphone receiving analog audio signals; 

an A/D converter converting the received analog audio signals to digital audio signals;

a signal processor extracting spectrally distinct feature signals from the digital audio signals;

a memory storing the extracted feature signals; and 

a terminal transmitting the stored extracted feature signals to the network computer. 

11. A device according to Claim 10, further comprising an anti-aliasing filter for
filtering the received analog audio signals. 

12. A device according to Claim 10, wherein said memory comprises a flash memory.

13. A device according to Claim 10, wherein said signal processor extracts a time 
series of signals corresponding to energy in a plurality of different frequency bands of the digital 
audio signals. 
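
As an illustration of the band-energy extraction recited in Claim 13, the following is a minimal sketch, not the claimed implementation. The function name `band_energy_series`, the frame length, the number of bands, and the use of non-overlapping frames with a naive DFT are all assumptions made for this example.

```python
import cmath
import math


def band_energy_series(samples, frame_len=64, num_bands=4):
    """Return one energy time series per frequency band.

    samples: list of floats (the digital audio signal).
    frame_len and num_bands are illustrative values, not taken from the claims.
    """
    half = frame_len // 2  # DFT bins below the Nyquist frequency
    bins_per_band = half // num_bands
    series = [[] for _ in range(num_bands)]
    # Step through the signal in non-overlapping frames (an assumption).
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Naive DFT power for bins 0 .. half-1; an FFT would be used in practice.
        power = []
        for k in range(half):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            power.append(abs(acc) ** 2)
        # Sum bin powers into contiguous bands: one new point per band per frame.
        for b in range(num_bands):
            lo = b * bins_per_band
            series[b].append(sum(power[lo:lo + bins_per_band]))
    return series
```

Each inner list is one time series, so a tone confined to one band shows up as a large value in that band's series and values near zero in the others.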



14. A device according to Claim 10, wherein said signal processor compresses the 
extracted feature signals, and wherein said memory stores the compressed signals. 

15. A device according to Claim 10, wherein said hand-held device comprises a
cellular telephone. 



16. A device according to Claim 10, wherein said hand-held device comprises a 
personal digital assistant.

17. A device according to Claim 10, wherein said hand-held device comprises a radio receiver.

18. A local processor for an audio signal recognition system having a hand-held 
device and a recognition server, the hand-held device capturing audio signals and downloading them 
to the local processor, the recognition server (i) receiving from the local processor extracted feature 
signals that correspond to the captured audio signals and (ii) comparing received extracted feature 
signals to a plurality of stored song information, the local processor comprising: 

an interface for receiving the captured audio signals from the hand-held device; 

a processor for forming extracted feature signals corresponding to the received 
captured audio signals, the extracted feature signals corresponding to different frequency bands of 
the captured audio signals; 

a memory for storing the extracted feature signals; and 

an activation device which causes the stored extracted feature signals to be sent to the 
recognition server. 

19. A processor according to Claim 18, further comprising audio structure for
playing back to a user a verification signal received from the recognition server, the verification 
signal corresponding to the captured audio signal. 

20. A processor according to Claim 18, wherein said processor forms the extracted
feature signal from less than an entire audio work. 

21. A processor according to Claim 18, wherein the local processor sends the 
extracted feature signals to the recognition server over the Internet. 

22. A recognition server for an audio signal recognition system having a hand-held 
device and a local processor, the hand-held device capturing audio signals and transmitting to the 
local processor signals which correspond to the captured audio signals, the local processor 
transmitting extracted feature signals to the recognition server, the recognition server comprising: 

an interface receiving the extracted feature signals from the local server; 
a memory storing a plurality of feature signal sets, each set corresponding to an entire 
audio work; and 

processing circuitry which (i) receives an input audio stream and separates the 
received audio stream into a plurality of different frequency bands; (ii) forms a plurality of feature 
time series waveforms which correspond to spectrally distinct portions of the received input audio 
stream; (iii) stores in the memory the plurality of feature signal sets which correspond to the feature 
time series waveforms, (iv) compares the received feature signals with the stored feature signal sets, 
and (v) provides a recognition signal when the received feature signals match at least one of the
stored feature signal sets. 

23. A server according to Claim 22, wherein said processing circuitry also (i) forms 
multiple feature streams from the plurality of feature time series waveforms; (ii) forms overlapping 
time intervals of the multiple feature streams; (iii) estimates the distinctiveness of each feature in 
each time interval; (iv) rank-orders the features according to their distinctiveness; (v) transforms the 
feature time series to obtain the complex spectra; and (viii) stores the feature complex spectra in the 
memory as the feature signal sets. 
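
The template-building steps recited in Claim 23 can be sketched as follows. This is a hypothetical illustration only: the claims do not say how distinctiveness is estimated, so variance within each interval is used here as a stand-in, and the names `overlapping_intervals`, `variance`, `dft`, and `build_template`, along with the interval length and hop, are assumptions.

```python
import cmath
import math


def overlapping_intervals(stream, length, hop):
    """Split one feature stream into intervals that overlap when hop < length."""
    return [stream[i:i + length] for i in range(0, len(stream) - length + 1, hop)]


def variance(values):
    """Population variance, used here as an assumed distinctiveness measure."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def dft(values):
    """Naive discrete Fourier transform returning the complex spectrum."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(values))
            for k in range(n)]


def build_template(feature_streams, length=8, hop=4):
    """For each time interval, rank features by distinctiveness and keep their spectra.

    feature_streams: equal-length lists of floats, one per feature.
    Returns, per interval, a list of (feature_index, complex_spectrum) pairs
    ordered from most to least distinctive.
    """
    per_feature = [overlapping_intervals(s, length, hop) for s in feature_streams]
    template = []
    for t in range(len(per_feature[0])):
        scored = [(variance(per_feature[f][t]), f) for f in range(len(feature_streams))]
        ranked = [f for _, f in sorted(scored, reverse=True)]  # most distinctive first
        template.append([(f, dft(per_feature[f][t])) for f in ranked])
    return template
```

Storing the spectra in rank order lets a matcher examine the most distinctive features of an interval first; whether the claimed server uses the ranking in that way is not stated in the claims.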

24. A server according to Claim 22, wherein said interface receives extracted feature 
signals which comprise less than an entire audio work. 

25. A server according to Claim 22, wherein said interface is coupled to the Internet. 

26. A server according to Claim 22, wherein said processor forwards to the local 
processor, verification audio signals which correspond to the matched at least one stored feature 
signal sets. 

27. A server according to Claim 22, wherein said processor forwards to the local 
processor, purchase signals which correspond to the matched at least one stored feature signal sets. 

28. A hand-held music capture device, comprising:

a microphone which receives an arbitrary portion of an analog audio signal; 
an analog-to-digital converter to convert the received portion of the audio signal into 
a digital signal; 

a signal processor which receives a fixed-time-portion of the digital signal and signal 
processes same into a digital time series representing the voltage waveform of the captured audio 
signal; 

a memory which stores the processed fixed-time portion of the digital signal that 
corresponds to less than a complete audio work; and 

a terminal which is connectable to a computer device and transmits the stored portion 
of the digital signal to the computer device. 

29. A device according to Claim 28, wherein the signal processor compresses the 
received arbitrary portion of the analog audio signal before storing it in said memory. 

30. Apparatus according to Claim 28, wherein said signal processor forms a time 
series signal corresponding to the energy in different frequency bands of the received analog audio 
signal. 

31. A portable device to capture and store samples of free-field audio signals and 
store these samples for later identification, comprising: 

a microphone to receive an audio waveform; 
an analog to digital converter to convert the received audio waveform into a digital time series;

a trigger to allow the user to manually initiate audio waveform reception; 
a signal processor to extract and compress spectrally distinct features of the received 
audio waveform; 

a memory to store the compressed spectrally distinct features; and 

an interface to allow transfer of the stored features to recognition equipment. 

32. A method for recognizing an input data stream, comprising the steps of:
receiving the input data stream with a hand held device; 

with the hand held device, randomly selecting any one portion of the received data stream;

forming a first plurality of feature time series waveforms corresponding to spectrally 
distinct portions of the received data stream; 

transmitting to a recognition site the first plurality of feature time series waveforms; 

storing a second plurality of feature time series waveforms at the recognition site; 

at the recognition site, correlating the first plurality of feature time series waveforms 
with the second plurality of feature time series waveforms; and 

designating a recognition when a correlation probability value between the first 
plurality of feature time series waveforms and one of the second plurality of feature time series 
waveforms reaches a predetermined value. 

33. A method for recognizing free-field audio signals, comprising the steps of:
capturing free-field audio signals with a hand-held device having a microphone; 
transmitting signals corresponding to the captured free-field audio signals to a local processor;

transmitting from the local processor to a recognition site, audio signal features which 
correspond to the signals transmitted from the hand-held device; 

one of the hand-held device and the local processor extracting a time series of 
spectrally distinct audio signal features from the captured free-field audio signals; 

storing data corresponding to a plurality of audio templates in a memory at the 
recognition site; 

correlating the audio signal features transmitted from the local processor with at least 
one of the audio templates stored in the recognition site memory, using a recognition processor; and 
providing a recognition signal based on the correlation. 



34. A method according to Claim 33, wherein said capturing step includes the steps of:

analog-to-digital converting the captured free-field audio signals; and 
extracting the time series of spectrally distinct audio signal features from the captured 
free-field audio signals. 



35. A method according to Claim 33, wherein said local processor extracts the time 
series of spectrally distinct audio signal features from the captured free-field audio signals.

36. A method according to Claim 33, wherein said local processor comprises a
personal computer coupled to the Internet. 



37. A method according to Claim 33, wherein said storing step comprises the step 
of storing in the recognition site memory a plurality of audio templates, each template corresponding 
to substantially an entire audio work. 



38. A method according to Claim 33, further comprising the step of storing, in a 
hand-held device memory, free-field audio signals which correspond to less than an entire audio 
work. 



39. A method according to Claim 33, wherein the audio work comprises a song. 



40. A method according to Claim 33, further comprising the step of the recognition 
processor, in response to the recognition signal, transmitting at least a portion of the at least one 
template stored in said recognition processor memory to the local processor for verification. 

41. A method according to Claim 33, wherein said recognition processor 
mathematically correlates the audio signal features transmitted from said local processor with the 
at least one of the audio templates stored in said recognition processor memory. 

42. A method for a hand-held device to capture audio signals to be transmitted from
a network computer to a recognition site, the recognition site having a processor which receives 
extracted feature signals that correspond to the captured audio signals and compares them to a 
plurality of stored song information, the method comprising the steps of: 

receiving analog audio signals with a microphone; 
A/D converting the received analog audio signals to digital audio signals; 
extracting spectrally distinct feature signals from the digital audio signals with a 
signal processor; 

storing the extracted feature signals in a memory; and

transmitting the stored extracted feature signals to the network computer through a terminal.

43. A method according to Claim 42, further comprising the step of anti-alias 
filtering the received analog audio signals. 

44. A method according to Claim 42, wherein said memory comprises a flash memory.

45. A method according to Claim 42, wherein said signal processor extracts a time 
series of signals corresponding to energy in a plurality of different frequency bands of the digital 
audio signals. 

46. A method according to Claim 42, wherein said signal processor compresses the
extracted feature signals, and wherein said memory stores the compressed signals. 
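
One way the compression of extracted feature signals in Claim 46 might look is sketched below. The claims do not name a compression scheme; the one-byte quantization followed by DEFLATE (Python's standard `zlib` module) shown here is an assumption, as are the function names `compress_features` and `decompress_features`.

```python
import zlib


def compress_features(features, levels=256):
    """Quantize feature values to one byte each, then deflate them.

    levels must not exceed 256 so that every quantized value fits in a byte.
    Returns the value range (needed to undo the scaling) and the compressed bytes.
    """
    lo, hi = min(features), max(features)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant series
    quantized = bytes(round((v - lo) / span * (levels - 1)) for v in features)
    return lo, hi, zlib.compress(quantized)


def decompress_features(lo, hi, blob, levels=256):
    """Approximate inverse of compress_features (the quantization is lossy)."""
    span = (hi - lo) or 1.0
    return [lo + q / (levels - 1) * span for q in zlib.decompress(blob)]
```

The memory would then store the returned range and byte string in place of the raw feature values; the restored values differ from the originals by at most one quantization step.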

47. A method according to Claim 42, wherein said hand-held device comprises a 
cellular telephone. 

48. A method according to Claim 42, wherein said hand-held device comprises a 
personal digital assistant. 

49. A method according to Claim 42, wherein said hand-held device comprises a radio receiver.

50. A local processor method in an audio signal recognition system having a hand-held
device and a recognition server, the hand-held device capturing audio signals and downloading
them to the local processor, the recognition server (i) receiving from the local processor extracted 
feature signals that correspond to the captured audio signals and (ii) comparing received extracted 
feature signals to a plurality of stored song information, the method comprising the steps of: 

receiving the captured audio signals from the hand-held device through an interface; 

forming extracted feature signals corresponding to the received captured audio signals 
with a processor, the extracted feature signals corresponding to different frequency bands of the 
captured audio signals; 

storing the extracted feature signals in a memory; and 

causing the stored extracted feature signals to be sent to the recognition server.

51. A method according to Claim 50, further comprising the step of playing back to
a user at the local processor, a verification signal received from the recognition server, the 
verification signal corresponding to the captured audio signal. 

52. A method according to Claim 50, wherein said processor forms the extracted 
feature signal from less than an entire audio work. 

53. A method according to Claim 50, wherein the local processor sends the extracted 
feature signals to the recognition server over the Internet. 

54. A recognition server method in an audio signal recognition system having a 
hand-held device and a local processor, the hand-held device capturing audio signals and 
transmitting to the local processor signals which correspond to the captured audio signals, the local 
processor transmitting extracted feature signals to the recognition server, the method comprising the 
steps of: 

receiving the extracted feature signals from the local server through an interface;

storing a plurality of feature signal sets in a memory, each set corresponding to an 
entire audio work; and 

with processing circuitry (i) receiving an input audio stream and separating the
received audio stream into a plurality of different frequency bands; (ii) forming a plurality of feature 
time series waveforms which correspond to spectrally distinct portions of the received input audio
stream; (iii) storing in the memory the plurality of feature signal sets which correspond to the feature 
time series waveforms, (iv) comparing the received feature signals with the stored feature signal sets, 
and (v) providing a recognition signal when the received feature signals match at least one of the 
stored feature signal sets. 

55. A method according to Claim 54, wherein said processing circuitry also (i) forms
multiple feature streams from the plurality of feature time series waveforms; (ii) forms overlapping 
time intervals of the multiple feature streams; (iii) estimates the distinctiveness of each feature in 
each time interval; (iv) rank-orders the features according to their distinctiveness; (v) transforms the 
feature time series to obtain the complex spectra; and (viii) stores the feature complex spectra in the 
memory as the feature signal sets. 

56. A method according to Claim 54, wherein said interface receives extracted 
feature signals which comprise less than an entire audio work. 

57. A method according to Claim 54, wherein said interface is coupled to the Internet.

58. A method according to Claim 54, wherein said processor forwards to the local 
processor, verification audio signals which correspond to the matched at least one stored feature 
signal sets. 

59. A method according to Claim 54, wherein said processor forwards to the local
processor, purchase signals which correspond to the matched at least one stored feature signal sets. 

60. Computer readable storage media storing code which causes one or more 
processors to carry out a method for recognizing an input data stream, the code causing the one or 
more processors to perform the steps of: 

receiving the input data stream with a hand held device; 

with the hand held device, randomly selecting any one portion of the received data stream;

forming a first plurality of feature time series waveforms corresponding to spectrally
distinct portions of the received data stream; 

transmitting to a recognition site the first plurality of feature time series waveforms; 

storing a second plurality of feature time series waveforms at the recognition site; 

at the recognition site, correlating the first plurality of feature time series waveforms 
with the second plurality of feature time series waveforms; and 

designating a recognition when a correlation probability value between the first 
plurality of feature time series waveforms and one of the second plurality of feature time series 
waveforms reaches a predetermined value. 
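
The correlating and designating steps recited in Claim 60 can be sketched as follows. This is an assumption-laden illustration rather than the claimed method: a normalized (Pearson) correlation coefficient stands in for the "correlation probability value", the short query is slid along each longer stored series, and the names `normalized_correlation`, `match_score`, and `recognize` and the 0.9 threshold are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation of two equal-length series, in [-1, 1]."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                     sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm else 0.0


def match_score(query_bands, template_bands):
    """Best band-averaged correlation over all alignments of query to template."""
    span = len(query_bands[0])
    offsets = range(len(template_bands[0]) - span + 1)
    return max(
        (sum(normalized_correlation(q, t[i:i + span])
             for q, t in zip(query_bands, template_bands)) / len(query_bands)
         for i in offsets),
        default=-1.0,  # template shorter than the query: no alignment possible
    )


def recognize(query_bands, templates, threshold=0.9):
    """Return the best template name whose score reaches the threshold, else None.

    query_bands: first plurality of feature time series (one list per band).
    templates: dict mapping a name to the second plurality, in the same band order.
    """
    best_name, best = None, threshold
    for name, bands in templates.items():
        score = match_score(query_bands, bands)
        if score >= best:
            best_name, best = name, score
    return best_name
```

Because the query covers only a portion of the data stream, the sliding alignment lets a short sample match anywhere inside a longer stored series.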

61. Computer readable storage media storing code which causes one or more 
processors to carry out a method for recognizing free-field audio signals, the code causing the one 
or more processors to perform the steps of: 

capturing free-field audio signals with a hand-held device having a microphone;
transmitting signals corresponding to the captured free-field audio signals to a local processor;

transmitting from the local processor to a recognition site, audio signal features which 
correspond to the signals transmitted from the hand-held device; 

at least one of the hand-held device and the local processor extracting a time series 
of spectrally distinct audio signal features from the captured free-field audio signals; 

storing data corresponding to a plurality of audio templates in a memory at the 
recognition site; 

correlating the audio signal features transmitted from the local processor with at least 
one of the audio templates stored in the recognition site memory, using a recognition processor; and 
providing a recognition signal based on the correlation. 

62. Computer readable storage media according to Claim 61, wherein said code 
includes code for causing the one or more processors to perform the steps of: 

analog-to-digital converting the captured free-field audio signals; and 
extracting the time series of spectrally distinct audio signal features from the captured 
free-field audio signals. 

63. Computer readable storage media according to Claim 61, wherein said code 
includes code for causing the said local processor to extract the time series of spectrally distinct 
audio signal features from the captured free-field audio signals.

64. Computer readable storage media according to Claim 61, wherein said local
processor comprises a personal computer coupled to the Internet. 

65. Computer readable storage media according to Claim 61, wherein said storing 
step comprises the step of storing in the recognition site memory a plurality of audio templates, each 
template corresponding to substantially an entire audio work. 

66. Computer readable storage media according to Claim 61, further comprising code
for causing the step of storing, in a hand-held device memory, free-field audio signals which 
correspond to less than an entire audio work. 

67. Computer readable storage media according to Claim 61, wherein the audio work
comprises a song. 

68. Computer readable storage media according to Claim 61, further comprising code
for causing the recognition processor, in response to the recognition signal, to transmit at least a 
portion of the at least one template stored in said recognition processor memory to the local 
processor for verification. 

69. Computer readable storage media according to Claim 61, further comprising code
for causing said recognition processor to mathematically correlate the audio signal features
transmitted from said local processor with the at least one of the audio templates stored in said
recognition processor memory.

70. Computer readable storage media storing code which causes a hand-held device
to capture audio signals to be transmitted from a network computer to a recognition site, the 
recognition site having a processor which receives extracted feature signals that correspond to the 
captured audio signals and compares them to a plurality of stored song information, the code causing 
the hand-held device to perform the steps of: 

receiving analog audio signals with a microphone; 
A/D converting the received analog audio signals to digital audio signals; 
extracting spectrally distinct feature signals from the digital audio signals with a 
signal processor; 

storing the extracted feature signals in a memory; and

transmitting the stored extracted feature signals to the network computer through a terminal.

71. Computer readable storage media according to Claim 70, wherein the code causes
said signal processor to extract a time series of signals corresponding to energy in a plurality of 
different frequency bands of the digital audio signals. 

72. Computer readable storage media according to Claim 70, wherein said code 
causes said signal processor to compress the extracted feature signals, and wherein said code causes 
said memory to store the compressed signals. 

73. Computer readable storage media according to Claim 70, wherein said hand-held
device comprises a cellular telephone. 

74. Computer readable storage media according to Claim 70, wherein said hand-held 
device comprises a personal digital assistant. 

75. Computer readable storage media according to Claim 70, wherein said hand-held
device comprises a radio receiver. 

76. Computer readable storage media storing code which causes a local processor 
to transmit extracted feature signals to a recognition server, in an audio signal recognition system 
having a hand-held device and the recognition server, the hand-held device capturing audio signals 
and downloading them to the local processor, the recognition server (i) receiving from the local 
processor extracted feature signals that correspond to the captured audio signals and (ii) comparing 
received extracted feature signals to a plurality of stored song information, the code causing the local 
processor to perform the steps of: 

receiving the captured audio signals from the hand-held device through an interface; 

forming extracted feature signals corresponding to the received captured audio signals 
with a processor, the extracted feature signals corresponding to different frequency bands of the 
captured audio signals; 

storing the extracted feature signals in a memory; and 

causing the stored extracted feature signals to be sent to the recognition server. 

77. Computer readable storage media according to Claim 76, wherein the code causes
the local processor to play back to a user at the local processor, a verification signal received from 
the recognition server, the verification signal corresponding to the captured audio signal. 

78. Computer readable storage media according to Claim 76, wherein said processor 
forms the extracted feature signal from less than an entire audio work. 

79. Computer readable storage media according to Claim 76, wherein the code causes
the local processor to send the extracted feature signals to the recognition server over the Internet. 

80. Computer readable storage media storing code which causes a recognition server 
to recognize signals in an audio signal recognition system having a hand-held device and a local 
processor, the hand-held device capturing audio signals and transmitting to the local processor 
signals which correspond to the captured audio signals, the local processor transmitting extracted 
feature signals to the recognition server, the code causing the recognition server to perform the steps 
of: 

receiving the extracted feature signals from the local server through an interface; 

storing a plurality of feature signal sets in a memory, each set corresponding to an 
entire audio work; and 

with processing circuitry (i) receiving an input audio stream and separating the
received audio stream into a plurality of different frequency bands; (ii) forming a plurality of feature 
time series waveforms which correspond to spectrally distinct portions of the received input audio 
stream; (iii) storing in the memory the plurality of feature signal sets which correspond to the feature
time series waveforms, (iv) comparing the received feature signals with the stored feature signal sets, 
and (v) providing a recognition signal when the received feature signals match at least one of the 
stored feature signal sets. 

81. Computer readable storage media according to Claim 80, wherein said code 
causes said processing circuitry to also (i) form multiple feature streams from the plurality of feature 
time series waveforms; (ii) form overlapping time intervals of the multiple feature streams; (iii) 
estimate the distinctiveness of each feature in each time interval; (iv) rank-order the features 
according to their distinctiveness; (v) transform the feature time series to obtain the complex spectra; 
and (viii) store the feature complex spectra in the memory as the feature signal sets. 

82. Computer readable storage media according to Claim 80, wherein said code 
causes said interface to receive extracted feature signals which comprise less than an entire audio 
work. 

83. Computer readable storage media according to Claim 80, wherein said code 
causes said processor to forward to the local processor, verification audio signals which correspond 
to the matched at least one stored feature signal sets. 

84. A business method of recognizing free-field audio signals, comprising the steps of:

capturing free-field audio signals with a hand-held device having a microphone;
transmitting signals corresponding to the captured free-field audio signals to a local processor;

transmitting from the local processor to a recognition site, audio signal features which 
correspond to the signals transmitted from the hand-held device; 

at least one of the hand-held device and the local processor extracting a time series 
of spectrally distinct audio signal features from the captured free-field audio signals; 

storing data corresponding to a plurality of audio templates in a memory at the 
recognition site; 

correlating the audio signal features transmitted from the local processor with at least 
one of the audio templates stored in the recognition site memory, using a recognition processor; 
providing a recognition signal based on the correlation; and

forwarding the recognition signal to a user at the local processor, together with 
instruction for the purchase of an audio work which corresponds to the at least one of the audio 
templates stored in the recognition site memory. 



85. A business method according to Claim 84, further comprising the steps of:
receiving payment authorization from said user; and 

in response to the authorization, forwarding the audio work which corresponds to the 
at least one of the audio templates stored in the recognition site memory to the user. 